%% CHEBFUN GUIDE 6: QUASIMATRICES AND LEAST-SQUARES
% Lloyd N. Trefethen, November 2009, revised February 2011

%% 6.1  Quasimatrices and "spy"
% A chebfun can have more than one column, or if it is transposed, it can
% have more than one row.  In these cases we get a
% *quasimatrix*, a "matrix" in which one of the dimensions is
% discrete as usual but the other is continuous.  Our default choice
% will be that of an "infinity by n" quasimatrix consisting of n columns,
% each of which is a chebfun.  When it is important to specify the orientation
% we use the term *column quasimatrix* or *row quasimatrix*.

%%
% Here for example is the quasimatrix consisting of the first
% six powers of x on the interval [-1,1].  The command "size" can be
% used to identify the continuous dimension, and to find the numbers
% of rows or columns:

  x = chebfun('x');
  A = [1 x x.^2 x.^3 x.^4 x.^5];
  size(A)
  size(A,2)

%%
% Here is the third column of A evaluated at the point x=0.5:

  A(0.5,3)

%%
% Here are the column sums, i.e., the integrals of 1, x,..., x^5
% from -1 to 1:

  format short, sum(A), format long

%%
% And here is a plot of the columns:

  plot(A), grid on

%%
% The term quasimatrix comes from [Stewart 1998], and the
% same idea appears with different terminology in [de Boor 1991] and
% [Trefethen & Bau 1997, pp. 52-54].  The idea is a natural one,
% since so much of applied linear algebra
% deals with discrete approximations to the continuous, but it seems not to have
% been discussed explicitly very much until the appearance of
% Chebfun [Battles & Trefethen 2004, Battles 2006].

%%
% If f and g are column chebfuns, then f'*g is a scalar, their inner product.
% For example, here is the inner product of x^2 and x^4 over [-1,1] (equal to 2/7):

  A(:,3)'*A(:,5)
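%%
% As a check, this inner product should match the integral of x^6 over
% [-1,1], which we can also compute with sum as above:

  sum(x.^6)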

%%
% More generally, if A and B are column quasimatrices with m and n columns, respectively,
% then A'*B is the m x n matrix of inner products of those columns.  Here is the
% 6 x 6 example corresponding to B=A:

  format short, A'*A, format long
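%%
% The entries of this Gram matrix can be worked out by hand: the (i,j)
% entry is the integral of x^(i+j-2) over [-1,1], which equals 2/(i+j-1)
% when i+j is even and 0 when i+j is odd.  Here is a sketch of this check:

  G = zeros(6);
  for i = 1:6
    for j = 1:6
      if mod(i+j,2) == 0, G(i,j) = 2/(i+j-1); end
    end
  end
  norm(A'*A - G)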

%%
% You can get an idea of the shape of a quasimatrix with the overloaded spy command:

  subplot(1,2,1), spy(A), title A
  subplot(1,2,2), spy(A'), title('A''')

%% 6.2 Backslash and least-squares

%%
% In Matlab, the command c = A\b computes the solution to the system
% of equations Ac = b if A is a square matrix, whereas if A is rectangular,
% with more rows than columns, it computes the least squares solution,
% the vector c that minimizes norm(A*c-b).  A quasimatrix is always rectangular, and
% "\" has accordingly been overloaded
% to carry out the appropriate continuous least-squares computation.
% (The actual Matlab command that handles backslash is called mldivide.)

%%
% For example, continuing with the same chebfun x and
% quasimatrix A as above, consider the following sequence:

  f = exp(x).*sin(6*x);
  c = A\f
  
%%
% The vector c can be interpreted as the vector of
% coefficients of the least-squares fit to f by a linear combination
% of the functions 1, x,..., x^5.  Here is a plot of f (in blue) and
% the least-squares approximation (in red), which we label ffit.

  ffit = A*c;
  clf, plot(f), grid on, hold on
  plot(ffit,'r'), hold off
  error = norm(f-ffit)
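%%
% The least-squares solution produced by backslash satisfies, up to
% rounding errors, the normal equations (A'*A)c = A'*f, though Chebfun
% actually computes it more stably by QR factorization.  A quick sketch
% of this consistency check:

  norm((A'*A)*c - A'*f)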

%%
% It is a general result that the least-squares approximation 
% by a polynomial of degree n to a continuous function f must intersect
% f at least n+1 times in the interval of approximation.
%
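%%
% We can test this for our degree-5 fit by counting the roots of the
% error function f - ffit, of which there should be at least 6:

  length(roots(f-ffit))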
%%
% Here is quite a different quasimatrix whose columns can be used to fit f.
% The columns correspond to hat functions located at points equally
% spaced from -1 to 1, and they are realized as piecewise smooth chebfuns.

  A2 = [];
  for j = 0:8
    xj = -1 + j/4;
    A2 = [A2 max(0,1-4*abs(x-xj))];
  end
  plot(A2)
  set(gca,'xtick',-1:.25:1)

%%
% A linear combination of these columns is a piecewise linear function
% with breakpoints at -0.75, -0.50,...,0.75.  Here is the least-squares
% fit by such functions to f.  Remember that although we happen to be
% fitting here by a function with a discrete flavor, all the operations
% are continuous ones involving integrals, not point evaluations.

  c = A2\f;
  ffit = A2*c;
  plot(f), grid on, hold on
  plot(ffit,'.-r'), hold off
  set(gca,'xtick',-1:.25:1)
  error = norm(f-ffit)
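%%
% Since each hat function overlaps only its immediate neighbors, the
% Gram matrix A2'*A2 of inner products is tridiagonal.  A sketch of a
% check of this structure:

  spy(A2'*A2)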
    

%% 6.3 QR factorization

%%
% Matrix least-squares problems are ordinarily solved by QR factorization,
% and in the quasimatrix case, they are solved by quasimatrix QR factorization.
% This is the technology underlying the backslash operator described in the
% last section.

%%
% A quasimatrix QR factorization takes this form:
%
%        A = QR,  where A is inf x n,  Q is inf x n,  R is n x n.
%
% The columns of A are arbitrary, the columns of Q are orthonormal, and
% R is an n x n upper-triangular matrix.  This factorization corresponds
% to what is known in various texts as the "reduced", "economy size", "skinny",
% "abbreviated", or "condensed"
% QR factorization, since Q is rectangular rather than square and R is
% square rather than rectangular.  In Matlab the syntax for computing such things is
% [Q,R] = qr(A,0), and the same command has been overloaded for chebfuns.  The
% computation makes use of a quasimatrix analogue of Householder triangularization
% [Trefethen 2010].  Alternatively one can simply write [Q,R] = qr(A):

  [Q,R] = qr(A);
  plot(Q), grid on
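%%
% The columns of Q are orthonormal, so Q'*Q should be the 6 x 6 identity
% matrix up to rounding errors:

  norm(Q'*Q - eye(6))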

%%
% The spy command confirms the shapes of these various matrices.
% One sees fewer dots in the spy plot of R than one might at first expect,
% the reason being that half of its upper-triangular entries are
% mathematically zero: the columns of A alternate between even and odd
% functions.  Because of rounding or truncation errors, however, not all
% those mathematical zeros are exactly zero on the computer, so the upper
% triangle of spy(R) appears as an imperfect checkerboard.

  subplot(1,3,1), spy(A), title A
  subplot(1,3,2), spy(Q), title Q
  subplot(1,3,3), spy(R), title R

%%
% The plot shows *orthogonal polynomials*, namely the orthogonalizations of
% the monomials 1, x,...,x^5 over [-1,1].  These are the famous Legendre
% polynomials {P_k} [Abramowitz & Stegun 1972],
% except that the latter are conventionally normalized by the
% condition P_k(1) = 1 rather than by having norm 1.  We can renormalize to
% impose this condition as follows:

  for j = 1:size(A,2)
    R(j,:) = R(j,:)*Q(1,j);
    Q(:,j) = Q(:,j)/Q(1,j);
  end
  clf, plot(Q), grid on

%%
% (A slicker way to produce this plot in Chebfun would be
% simply to type plot(legpoly(0:5)).)
  
%%
% If A=QR, then A*inv(R) = Q, and here is inv(R):

  format short, inv(R), format long

%%
% The kth column of inv(R) is the vector of coefficients for the 
% expansion of the kth column of Q as a linear combination of the
% columns of A, that is, the monomials 1, x, x^2,....  In other words,
% the kth column of inv(R) is the vector of coefficients of the kth
% Legendre polynomial.  For example, we see from the matrix
% that P_3(x) = 2.5x^3 - 1.5x.
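%%
% As a check on this reading of inv(R), we can compare the fourth column
% of the renormalized Q against 2.5x^3 - 1.5x directly:

  norm(Q(:,4) - (2.5*x.^3 - 1.5*x))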

%%
% Here is what the hat functions look like after orthonormalization:

  [Q2,R2] = qr(A2);
  plot(Q2)

%% 6.4 SVD, norm, cond

%%
% An m x n matrix A defines a map from R^n to R^m, and in particular, A maps the unit ball in
% R^n to a hyperellipsoid of dimension <=n in R^m.  The (reduced, skinny, condensed,...)
% *SVD* or *singular value decomposition* exhibits this
% map by providing a factorization  AV = US or equivalently
% A = USV', where U is m x n with orthonormal columns, S is diagonal with
% nonincreasing nonnegative diagonal entries known as the *singular values*,
% and V is n x n and orthogonal.
% A maps v_j, the jth column of V or jth *right singular vector*, to s_j times
% u_j, the jth column of U or jth *left singular vector*, which is the vector
% defining the jth largest semiaxis of the hyperellipsoid.  See Chapters 4 and 5
% of [Trefethen & Bau 1997].

%%
% If A is an inf x n quasimatrix, everything is analogous:
%
%      A = USV',  where A is inf x n,  U is inf x n,  S is n x n,  V is n x n.
%
% The image of the unit ball in R^n under A is still a hyperellipsoid of dimension <=n, which now
% lies within an infinite-dimensional function space.
% The columns of U are orthonormal functions and S and V have the same properties
% as in the matrix case.

%%
% For example, here are the singular values of the quasimatrix
% A defined earlier with columns 1,x,...,x^5:

  s = svd(A,0)

%%
% The largest singular value is equal to the norm of the quasimatrix, which
% is defined by  norm(A,2) = max_x norm(A*x)/norm(x).

  norm(A,2)
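%%
% This value should agree with the largest singular value s(1) computed
% above, up to rounding errors:

  abs(norm(A,2) - s(1))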

%%
% (Note that we must include the argument "2" here: for reasons of speed,
% the default for quasimatrices, unlike the usual Matlab matrices, is
% the Frobenius norm rather than the 2-norm.)
% The SVD enables us to identify exactly what vectors are involved in achieving
% this maximum ratio.  The optimal vector x is v1, the first right singular vector of A:

  [U,S,V] = svd(A);

%%
% First we use spy to confirm the shapes of the matrices.  As with spy(R) earlier, here
% spy(V) should in principle show a checkerboard, but some of its blanks are turned into
% dots by rounding or truncation errors.

  subplot(1,4,1), spy(A), title A
  subplot(1,4,2), spy(U), title U
  subplot(1,4,3), spy(S), title S
  subplot(1,4,4), spy(V), title V
  v1 = V(:,1)

%%
% Next we confirm that the norm of v1 is 1:

  norm(v1)

%%
% This vector is mapped by A to the chebfun s1*u1:

  u1 = U(:,1);
  norm(u1)

%%

  s1 = S(1,1)

%%
  norm(A*v1)

%%
  norm(A*v1-s1*u1)


%%
% Similarly, the minimal singular value and corresponding singular vectors
% describe the minimum amount that A can enlarge an input.  The following
% commands plot the extreme functions A*v1 (blue) and A*vn (red).  We can 
% interpret these as the largest and smallest degree-5 polynomials, as
% measured in the 2-norm over [-1,1], whose coefficient vectors have 2-norm equal to 1.

  clf, plot(A*v1), grid on, hold on
  vn = V(:,end);
  plot(A*vn,'r'), hold off

%%
% The ratio of the largest and smallest singular values -- the
% eccentricity of the hyperellipsoid -- is the condition number of A:

  max(s)/min(s)

%%
  cond(A)

%%
% The fact that cond(A) is a good deal greater than 1 reflects the ill-conditioning
% of the monomials 1,x,...,x^5 as a basis for polynomials on [-1,1].  The effect
% becomes rapidly stronger as we take more terms in the sequence:

  cond([1 x x.^2 x.^3 x.^4 x.^5 x.^6 x.^7 x.^8 x.^9 x.^10])

%%
% By contrast a quasimatrix formed of suitably normalized Legendre polynomials has
% condition number 1, since they are orthonormal:
  cond(legpoly(0:10,'norm'))

%%
% A quasimatrix of Chebyshev polynomials doesn't quite achieve
% condition number 1, but it comes close:
  cond(chebpoly(0:10))

%%
% Chebyshev polynomials form an excellent basis for expansions on [-1,1], a fact
% that is the starting point of Chebfun.

%% 6.5 Other norms

%%
% The definition  norm(A) = max_x norm(A*x)/norm(x)
% makes sense in other norms besides the 2-norm, and the particularly 
% important alternatives are the 1-norm and the inf-norm.  The 1-norm of
% a column quasimatrix is the "maximum column sum", i.e., the maximum of
% the 1-norms of its columns.   In the case of our quasimatrix A,
% the maximum is attained by the first column, which has norm 2:

  norm(A,1)

%%
% The inf-norm is the "maximum row sum", which for a column quasimatrix
% corresponds to the maximum of the chebfun obtained by adding the
% absolute values of the columns.  In the case of A, the sum is
% 1+|x|+...+|x|^5, which attains its maximum value 6 at x=-1 and 1:

  norm(A,inf)

%%
% The norms of row quasimatrices are analogous, with
% norm(A',inf) = norm(A,1) and norm(A',1) = norm(A,inf).
% Like Matlab itself applied to a rectangular matrix, Chebfun
% does not define cond(A,1) or cond(A,inf) if A is a quasimatrix.

%%
% The Frobenius or Hilbert-Schmidt norm is equal to the square root of the sum of the
% squares of the singular values:

  norm(A,'fro')
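%%
% We can sketch a check of this identity using the singular values s
% computed earlier:

  norm(A,'fro') - sqrt(sum(s.^2))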

%% 6.6 rank, null, orth, pinv

%%
% Chebfun also contains overloads for some further
% Matlab operations related to orthogonal matrix factorizations.
% Perhaps the most useful of these is rank(A), which computes the singular
% values of A and makes a judgement as to how many of them are significantly
% different from zero.  For example, with x still defined as before, here is an
% example showing that the functions 1, sin(x)^2, and cos(x)^2 are linearly dependent:

  B = [1 sin(x).^2 cos(x).^2];
  rank(B)

%%
% Since B is rank-deficient, it has a nontrivial nullspace, and the command
% null(B) will find an orthonormal basis for it:

  null(B)
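%%
% Because 1 - sin(x)^2 - cos(x)^2 = 0, the nullspace is spanned by a
% multiple of the vector [1 -1 -1]'/sqrt(3), and B should map the computed
% null vector to zero up to rounding errors:

  norm(B*null(B))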

%%
% Similarly the command orth(B) finds an orthonormal basis for the range of B,
% which in this case has dimension 2:

  orth(B)

%%
% If A is an inf x n column quasimatrix, the command pinv(A) computes the pseudoinverse
% of A, an n x inf row quasimatrix such that pinv(A)*f = A\f for any column chebfun f.
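%%
% As a sketch of a consistency check, applying the pseudoinverse to our
% earlier function f should reproduce the least-squares coefficients
% computed by backslash:

  norm(pinv(A)*f - A\f)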

%%
% Here is a summary of the dimensions of these objects:

  subplot(1,3,1), spy(null(B)), title null(B)
  subplot(1,3,2), spy(orth(B)), title orth(B)
  subplot(1,3,3), spy(pinv(A)), title pinv(A)


%% 6.7  References
% 
% [Abramowitz & Stegun 1972]
% M. Abramowitz and I. A. Stegun, eds., Handbook of Mathematical Functions with
% Formulas, Graphs, and Mathematical Tables, 9th printing, Dover, 1972.
%
% [Battles 2006] Z. Battles, Numerical Linear Algebra for
% Continuous Functions, DPhil thesis, Oxford University
% Computing Laboratory, 2006.
%
% [Battles & Trefethen 2004] Z. Battles and L. N. Trefethen,
% "An extension of Matlab to continuous functions and
% operators", SIAM Journal on Scientific Computing 25 (2004),
% 1743-1770.
%
% [de Boor 1991] C. de Boor, "An alternative approach to (the teaching
% of) rank, basis, and dimension", Linear Algebra and its Applications
% 146 (1991), 221-229.
%
% [Stewart 1998] G. W. Stewart, Afternotes Goes to Graduate School:
% Lectures on Advanced Numerical Analysis, SIAM, 1998.
%
% [Trefethen 2010] L. N. Trefethen, "Householder triangularization of
% a quasimatrix", IMA Journal of Numerical Analysis 30 (2010), 887-897.
%
% [Trefethen & Bau 1997] L. N. Trefethen and D. Bau, III, Numerical Linear
% Algebra, SIAM, 1997.

